深度学习的显着成功引起了人们对医学成像诊断的应用的兴趣。尽管最新的深度学习模型在分类不同类型的医学数据方面已经达到了人类水平的准确性,但这些模型在临床工作流程中几乎不采用,这主要是由于缺乏解释性。深度学习模型的黑盒子性提出了制定策略来解释这些模型的决策过程的必要性,从而导致了可解释的人工智能(XAI)主题的创建。在这种情况下,我们对应用于医学成像诊断的XAI进行了详尽的调查,包括视觉,基于示例和基于概念的解释方法。此外,这项工作回顾了现有的医学成像数据集和现有的指标,以评估解释的质量。此外,我们还包括一组基于报告生成的方法的性能比较。最后,还讨论了将XAI应用于医学成像以及有关该主题的未来研究指示的主要挑战。
translated by 谷歌翻译
主动感知和凹觉视觉是人类视觉系统的基础。虽然动脉凹视觉减少了在注视期间要处理的信息的量,但主动感知会将凝视方向转变为视野中最有前途的部分。我们提出了一种方法,以模仿人类和机器人使用中央摄像机探索场景,并以最少的凝视转移来识别周围环境中存在的物体。我们的方法基于三种关键方法。首先,我们采用现成的深度对象检测器,并在大量的常规图像数据集上进行了预训练,并将分类输出校准为foveateat图像的情况。其次,考虑了几种数据融合技术,对对象分类和相应的不确定性编码对象分类和相应的不确定性进行了依次更新。第三,下一个最好的目光固定点是基于信息理论指标确定的,旨在最大程度地减少语义图的总预期不确定性。与随机选择的下一个凝视转移相比,提出的方法可以使检测的F1分数增加2-3个百分点,以相同数量的凝视偏移,并减少三分之一,而三分之一则是所需的凝视转移数量以达到相似的性能。
translated by 谷歌翻译
本文档描述了基于深度学习的点云几何编解码器和基于深度学习的点云关节几何和颜色编解码器,并提交给2022年1月发出的JPEG PLENO点云编码的建议。拟议的编解码器是基于最新的。基于深度学习的PC几何编码的发展,并提供了呼吁提案的一些关键功能。拟议的几何编解码器提供了一种压缩效率,可超过MPEG G-PCC标准和胜过MPEG的效率,或者与V-PCC Intra Intra Interra Interra Intra标准的竞争力均超过了jpeg呼叫提案测试集;但是,由于需要克服的质量饱和效应,关节几何和颜色编解码器不会发生同样的情况。
translated by 谷歌翻译
Advances in image processing and analysis as well as machine learning techniques have contributed to the use of biometric recognition systems in daily people tasks. These tasks range from simple access to mobile devices to tagging friends in photos shared on social networks and complex financial operations on self-service devices for banking transactions. In China, the use of these systems goes beyond personal use becoming a country's government policy with the objective of monitoring the behavior of its population. On July 05th 2021, the Brazilian government announced acquisition of a biometric recognition system to be used nationwide. In the opposite direction to China, Europe and some American cities have already started the discussion about the legality of using biometric systems in public places, even banning this practice in their territory. In order to open a deeper discussion about the risks and legality of using these systems, this work exposes the vulnerabilities of biometric recognition systems, focusing its efforts on the face modality. Furthermore, it shows how it is possible to fool a biometric system through a well-known presentation attack approach in the literature called morphing. Finally, a list of ten concerns was created to start the discussion about the security of citizen data and data privacy law in the Age of Artificial Intelligence (AI).
translated by 谷歌翻译
This paper presents a machine learning approach to multidimensional item response theory (MIRT), a class of latent factor models that can be used to model and predict student performance from observed assessment data. Inspired by collaborative filtering, we define a general class of models that includes many MIRT models. We discuss the use of penalized joint maximum likelihood (JML) to estimate individual models and cross-validation to select the best performing model. This model evaluation process can be optimized using batching techniques, such that even sparse large-scale data can be analyzed efficiently. We illustrate our approach with simulated and real data, including an example from a massive open online course (MOOC). The high-dimensional model fit to this large and sparse dataset does not lend itself well to traditional methods of factor interpretation. By analogy to recommender-system applications, we propose an alternative "validation" of the factor model, using auxiliary information about the popularity of items consulted during an open-book exam in the course.
translated by 谷歌翻译
Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to the object pose and point's permutation, which generates PCDs that are geometrically consistent and completed properly. Experiments on a wide range of partial PCD show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.
translated by 谷歌翻译
Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments to increase the speed of acquisition often results in systems with a narrower spectral bandwidth, and hence a lower axial resolution. Traditionally, image-processing-based techniques have been utilized to reconstruct subsampled OCT data and more recently, deep-learning-based methods have been explored. In this study, we simulate reduced axial scan (A-scan) resolution by Gaussian windowing in the spectral domain and investigate the use of a learning-based approach for image feature reconstruction. In anticipation of the reduced resolution that accompanies wide-field OCT systems, we build upon super-resolution techniques to explore methods to better aid clinicians in their decision-making to improve patient outcomes, by reconstructing lost features using a pixel-to-pixel approach with an altered super-resolution generative adversarial network (SRGAN) architecture.
translated by 谷歌翻译
Using Structural Health Monitoring (SHM) systems with extensive sensing arrangements on every civil structure can be costly and impractical. Various concepts have been introduced to alleviate such difficulties, such as Population-based SHM (PBSHM). Nevertheless, the studies presented in the literature do not adequately address the challenge of accessing the information on different structural states (conditions) of dissimilar civil structures. The study herein introduces a novel framework named Structural State Translation (SST), which aims to estimate the response data of different civil structures based on the information obtained from a dissimilar structure. SST can be defined as Translating a state of one civil structure to another state after discovering and learning the domain-invariant representation in the source domains of a dissimilar civil structure. SST employs a Domain-Generalized Cycle-Generative (DGCG) model to learn the domain-invariant representation in the acceleration datasets obtained from a numeric bridge structure that is in two different structural conditions. In other words, the model is tested on three dissimilar numeric bridge models to translate their structural conditions. The evaluation results of SST via Mean Magnitude-Squared Coherence (MMSC) and modal identifiers showed that the translated bridge states (synthetic states) are significantly similar to the real ones. As such, the minimum and maximum average MMSC values of real and translated bridge states are 91.2% and 97.1%, the minimum and the maximum difference in natural frequencies are 5.71% and 0%, and the minimum and maximum Modal Assurance Criterion (MAC) values are 0.998 and 0.870. This study is critical for data scarcity and PBSHM, as it demonstrates that it is possible to obtain data from structures while the structure is actually in a different condition or state.
translated by 谷歌翻译
Osteoarthritis (OA) is the most prevalent chronic joint disease worldwide, where knee OA takes more than 80% of commonly affected joints. Knee OA is not a curable disease yet, and it affects large columns of patients, making it costly to patients and healthcare systems. Etiology, diagnosis, and treatment of knee OA might be argued by variability in its clinical and physical manifestations. Although knee OA carries a list of well-known terminology aiming to standardize the nomenclature of the diagnosis, prognosis, treatment, and clinical outcomes of the chronic joint disease, in practice there is a wide range of terminology associated with knee OA across different data sources, including but not limited to biomedical literature, clinical notes, healthcare literacy, and health-related social media. Among these data sources, the scientific articles published in the biomedical literature usually make a principled pipeline to study disease. Rapid yet, accurate text mining on large-scale scientific literature may discover novel knowledge and terminology to better understand knee OA and to improve the quality of knee OA diagnosis, prevention, and treatment. The present works aim to utilize artificial neural network strategies to automatically extract vocabularies associated with knee OA diseases. Our finding indicates the feasibility of developing word embedding neural networks for autonomous keyword extraction and abstraction of knee OA.
translated by 谷歌翻译
Objective: Accurate visual classification of bladder tissue during Trans-Urethral Resection of Bladder Tumor (TURBT) procedures is essential to improve early cancer diagnosis and treatment. During TURBT interventions, White Light Imaging (WLI) and Narrow Band Imaging (NBI) techniques are used for lesion detection. Each imaging technique provides diverse visual information that allows clinicians to identify and classify cancerous lesions. Computer vision methods that use both imaging techniques could improve endoscopic diagnosis. We address the challenge of tissue classification when annotations are available only in one domain, in our case WLI, and the endoscopic images correspond to an unpaired dataset, i.e. there is no exact equivalent for every image in both NBI and WLI domains. Method: We propose a semi-surprised Generative Adversarial Network (GAN)-based method composed of three main components: a teacher network trained on the labeled WLI data; a cycle-consistency GAN to perform unpaired image-to-image translation, and a multi-input student network. To ensure the quality of the synthetic images generated by the proposed GAN we perform a detailed quantitative, and qualitative analysis with the help of specialists. Conclusion: The overall average classification accuracy, precision, and recall obtained with the proposed method for tissue classification are 0.90, 0.88, and 0.89 respectively, while the same metrics obtained in the unlabeled domain (NBI) are 0.92, 0.64, and 0.94 respectively. The quality of the generated images is reliable enough to deceive specialists. Significance: This study shows the potential of using semi-supervised GAN-based classification to improve bladder tissue classification when annotations are limited in multi-domain data.
translated by 谷歌翻译